List of Trained UNet Models

Ideas to be checked

  1. Frankle & Carbin (2018), Abstract – „Neural network pruning techniques can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without compromising accuracy.”
  2. Nakkiran et al. (2019) – describes „a “double-descent” phenomenon where, as we increase model size, performance first gets worse and then gets better”.

Ideas checked during UNet development

  1. Ronneberger, Fischer, & Brox (2015) – „[…] presents a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently” (in our case this is not the strongest aspect of the algorithm).

  2. Ioffe & Szegedy (2015), p. 5 – sections 3.3 and 3.4: batch normalization enables higher learning rates and regularizes the model.

  3. Milletari, Navab, & Ahmadi (2016), p. 6 – „[…] not need to assign weights to samples of different classes to establish the right balance between foreground and background voxels”

    • In medical volumes such as the ones we are processing in this work, it is not uncommon that the anatomy of interest occupies only a very small region of the scan. This often causes the learning process to get trapped in local minima of the loss function yielding a network whose predictions are strongly biased towards background. As a result the foreground region is often missing or only partially detected.
  4. Masters & Luschi (2018), Abstract – „While the use of large mini-batches increases the available computational parallelism, small batch training has been shown to provide improved generalization performance and allows a significantly smaller memory footprint, which might also be exploited to improve machine throughput.”

  5. Zhang, Bengio, Hardt, Recht, & Vinyals (2021), Abstract – „Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small gap between training and test performance.”

Multi-label balanced loss function:

# Multi-label balanced Dice loss: mean Dice over all labels and batch items.
# Expects tensors of shape (batch, H, W, labels); k_* axes are 1-based in R keras.
mbdice_loss <- function(y_true, y_pred, epsilon = 0) {
  # Per-batch-item, per-label sums over the spatial axes.
  intersection <- keras::k_sum(y_true * y_pred, axis = c(2, 3))
  union <- keras::k_sum(y_true * y_true + y_pred * y_pred, axis = c(2, 3))
  # epsilon > 0 smooths the ratio and avoids division by zero for empty labels.
  dice <- keras::k_mean(2 * intersection / (union + epsilon), axis = c(2, 1))
  1 - dice
}
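As a sanity check, the same arithmetic can be reproduced in base R on a small array (a sketch that does not use the keras backend; the shape convention batch × H × W × labels is an assumption):

```r
# Base-R replica of mbdice_loss for a 4-D array (batch, H, W, labels).
mbdice_loss_ref <- function(y_true, y_pred, epsilon = 0) {
  intersection <- apply(y_true * y_pred, c(1, 4), sum)            # sum over H, W
  union <- apply(y_true * y_true + y_pred * y_pred, c(1, 4), sum)
  dice <- mean(2 * intersection / (union + epsilon))              # mean over batch and labels
  1 - dice
}

# Perfect binary prediction: Dice = 1, so the loss is 0.
y <- array(c(1, 0, 0, 1), dim = c(1, 2, 2, 1))
mbdice_loss_ref(y, y)  # 0
```

With a uniform 0.5 prediction against the same binary target, the Dice score is 2/3 and the loss 1/3, which matches the per-label ratio in the keras version.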

See section 3 „Dice loss layer” in Milletari, Navab, & Ahmadi (2016), p. 6 for the two-label definition.
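In the notation of the code above, with pixels $i$, batch items $b \in B$ and labels $l \in L$, writing $t$ for `y_true` and $p$ for `y_pred`, the computed loss is:

```latex
\mathcal{L}_{\mathrm{mbdice}}
  = 1 - \frac{1}{|B|\,|L|} \sum_{b \in B} \sum_{l \in L}
    \frac{2 \sum_i t_{b,i,l}\, p_{b,i,l}}
         {\sum_i t_{b,i,l}^2 + \sum_i p_{b,i,l}^2 + \epsilon}
```

Averaging the per-label Dice scores with equal weight is what makes the loss „balanced”: a rare label (e.g. VAT) contributes as much to the mean as the background label.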

ANTsRNet & R

  1. Grainger et al. (2018) – uses the ANTsRNet UNet model and describes its architecture.

Training UNets

  1. Using early stopping: different epochs for SAT and VAT.
  2. Pros and cons of mean losses.
training_labels <- c(
  loss = "Multilabel Balanced Dice Loss",
  sdice = "Multilabel Dice Score",
  background_dice = "Background Dice Score",
  vsat_dice = "VAT Dice Score",
  scat_dice = "SAT Dice Score"
)
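For idea 1 above, a minimal base-R sketch of what a tolerance-based stopping rule might look like (this is an assumed reading of the project's `early_stoppings(hist, tolerance = 0.005)` helper, which operates per metric on a history table; the function below is hypothetical):

```r
# Hypothetical: return the first epoch after which a monitored loss improves
# by less than `tolerance` -- so SAT and VAT metrics can stop at different epochs.
first_plateau_epoch <- function(loss, tolerance = 0.005) {
  improvement <- -diff(loss)            # positive when the loss decreases
  idx <- which(improvement < tolerance)
  if (length(idx) == 0) length(loss) else idx[1]
}

vat_loss <- c(0.50, 0.30, 0.20, 0.198, 0.197)
first_plateau_epoch(vat_loss)  # 3: improvements after epoch 3 fall below tolerance
```

Applying the same rule to each `*_dice` column separately yields one stopping epoch per metric, which is why the plots below mark different segments for SAT and VAT.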

Regular Data – 204 MR Images

mdice <- "run/2021-03-10/unet_2d_mdice/fat_768x384x1/unet_2d_mdice_AdaDelta_fat_768x384x1_N204_E24_L4_FBL24.csv"
tb <- readr::read_csv(mdice)
hist <- to_history(tb)
early_stoppings(hist, tolerance = 0.005) %>% arrange(metric, epoch)
plot_history(
  hist,
  lookup_table = training_labels,
  segments = early_stoppings(hist, tolerance = 0.005),
  base_size = 12
) +
  plot_annotation(
    title = 'The surprising truth about training UNet neural networks',
    subtitle = 'These 5 plots will reveal yet-untold secrets about our beloved data-set',
    caption = 'UNet NN: unet_2d_mdice_AdaDelta_fat_768x384x1_N204_E24_L4_FBL24'
  )

mdice <- "run/2021-03-02/unet_2d_mdice/fat_768x384x1/unet_2d_mdice_AdaDelta_fat_768x384x1_N204_E24_L4_FBL24_DR40.csv"
tb <- readr::read_csv(mdice)
hist <- to_history(tb)

plot_history(
  hist,
  lookup_table = training_labels,
  segments = early_stoppings(hist, tolerance = 0.005),
  base_size = 12
) +
  plot_annotation(
    title = 'The surprising truth about training UNet neural networks',
    subtitle = 'These 5 plots will reveal yet-untold secrets about our beloved data-set',
    caption = 'UNet NN: unet_2d_mdice_AdaDelta_fat_768x384x1_N204_E24_L4_FBL24_DR20'
  )

Augmented Data – 404 MR Images

mdice_ag <- "run/2021-03-01/unet_2d_mdice_ag/fat_768x384x1/unet_2d_mdice_ag_AdaDelta_fat_768x384x1_N204_E24_L4_FBL24.csv"
tb_ag <- readr::read_csv(mdice_ag)
hist_ag <- to_history(tb_ag)

plot_history(
  hist_ag,
  lookup_table = training_labels,
  segments = early_stoppings(hist_ag, tolerance = 0.005),
  base_size = 14
) +
  plot_annotation(
    title = 'The surprising truth about training UNet neural networks',
    caption = mdice_ag
  )

mdice_ag <- "run/2021-03-02/unet_2d_mdice_ag/fat_768x384x1/unet_2d_mdice_ag_AdaDelta_fat_768x384x1_N204_E24_L4_FBL24_DR20.csv"
tb_ag <- readr::read_csv(mdice_ag)
hist_ag <- to_history(tb_ag)

plot_history(
  hist_ag,
  lookup_table = training_labels,
  segments = early_stoppings(hist_ag, tolerance = 0.005),
  base_size = 12
) +
  plot_annotation(
    title = 'The surprising truth about training UNet neural networks',
    subtitle = 'These 5 plots will reveal yet-untold secrets about our beloved data-set',
    caption = 'UNet NN: unet_2d_mdice_ag_AdaDelta_fat_768x384x1_N204_E24_L4_FBL24_DR20'
  )


References

Frankle, J., & Carbin, M. (2018). The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks. arXiv e-Prints, arXiv:1803.03635. Retrieved from https://arxiv.org/abs/1803.03635
Grainger, A. T., Tustison, N. J., Qing, K., Roy, R., Berr, S. S., & Shi, W. (2018). Deep learning-based quantification of abdominal fat on magnetic resonance images. PLOS ONE, 13(9), e0204071. https://doi.org/10.1371/journal.pone.0204071
Ioffe, S., & Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv e-Prints, arXiv:1502.03167. Retrieved from https://arxiv.org/abs/1502.03167
Masters, D., & Luschi, C. (2018). Revisiting Small Batch Training for Deep Neural Networks. arXiv e-Prints, arXiv:1804.07612. Retrieved from https://arxiv.org/abs/1804.07612
Milletari, F., Navab, N., & Ahmadi, S.-A. (2016). V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. arXiv e-Prints, arXiv:1606.04797. Retrieved from https://arxiv.org/abs/1606.04797
Nakkiran, P., Kaplun, G., Bansal, Y., Yang, T., Barak, B., & Sutskever, I. (2019). Deep Double Descent: Where Bigger Models and More Data Hurt. arXiv e-Prints, arXiv:1912.02292. Retrieved from https://arxiv.org/abs/1912.02292
Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. arXiv e-Prints, arXiv:1505.04597. Retrieved from https://arxiv.org/abs/1505.04597
Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2021). Understanding deep learning (still) requires rethinking generalization. Communications of the ACM, 64(3), 107–115. https://doi.org/10.1145/3446776